A Maximum-Entropy-Inspired Parser
Author
Abstract
We present a new parser for parsing down to Penn tree-bank style parse trees that achieves 90.1% average precision/recall for sentences of length 40 and less, and 89.5% for sentences of length 100 and less, when trained and tested on the previously established [5,9,10,15,17] "standard" sections of the Wall Street Journal treebank. This represents a 13% decrease in error rate over the best single-parser results on this corpus [9]. The major technical innovation is the use of a "maximum-entropy-inspired" model for conditioning and smoothing that lets us successfully test and combine many different conditioning events. We also present some partial results showing the effects of different conditioning information, including a surprising 2% improvement due to guessing the lexical head's pre-terminal before guessing the lexical head.
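The 13% figure is a relative reduction in error, not an absolute gain in accuracy. As a rough sketch of that arithmetic, the snippet below back-computes the baseline accuracy implied by the abstract's numbers. It assumes the 13% is measured against the 90.1% figure for sentences of length 40 and less, and that error means 100 minus average precision/recall; the abstract states neither explicitly, and the function names are illustrative only.

```python
def relative_error_reduction(old_accuracy: float, new_accuracy: float) -> float:
    """Fraction by which the error rate (100 - accuracy) shrinks."""
    old_error = 100.0 - old_accuracy
    new_error = 100.0 - new_accuracy
    return (old_error - new_error) / old_error


def implied_baseline_accuracy(new_accuracy: float, reduction: float) -> float:
    """Invert the formula above: which baseline accuracy gives this reduction?"""
    new_error = 100.0 - new_accuracy
    old_error = new_error / (1.0 - reduction)
    return 100.0 - old_error


# Assumption: the 13% reduction applies to the 90.1% (length <= 40) result.
baseline = implied_baseline_accuracy(90.1, 0.13)
print(f"implied baseline accuracy: {baseline:.1f}%")  # about 88.6%

# Round trip: the implied baseline reproduces the stated reduction.
print(f"relative error reduction: {relative_error_reduction(baseline, 90.1):.2f}")
```

Under these assumptions, an improvement of about 1.5 percentage points corresponds to removing roughly one in eight of the remaining errors.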
Similar resources
A maximum entropy semantic parser using word classes
This paper describes the parser that is used in the Sail Labs Conversational System, which is a spoken dialog system. This parser is a fully statistical, semantic parser. The probability model of the parser is based on the principle of maximum entropy. The maximum entropy framework allows the available information to be combined in a fully automatic way, but the training of maximum entropy models i...
A Linear Observed Time Statistical Parser Based on Maximum Entropy Models
This paper presents a statistical parser for natural language that obtains a parsing accuracy—roughly 87% precision and 86% recall—which surpasses the best previously published results on the Wall St. Journal domain. The parser itself requires very little human intervention, since the information it uses to make parsing decisions is specified in a concise and simple manner, and is combined in a...
Using a maximum entropy-based tagger to improve a very fast vine parser
In this short paper, an off-the-shelf maximum entropy-based POS-tagger is used as a partial parser to improve the accuracy of an extremely fast linear time dependency parser that provides state-of-the-art results in multilingual unlabeled POS sequence parsing.
A Maximum-Entropy Partial Parser for Unrestricted Text
This paper describes a partial parser that assigns syntactic structures to sequences of part-of-speech tags. The program uses the maximum entropy parameter estimation method, which allows a flexible combination of different knowledge sources: the hierarchical structure, parts of speech and phrasal categories. In effect, the parser goes beyond simple bracketing and recognises even fairly complex ...
A Maximum Entropy Chinese Character-Based Parser
The paper presents a maximum entropy Chinese character-based parser trained on the Chinese Treebank ("CTB" henceforth). Word-based parse trees in CTB are first converted into character-based trees, where word-level part-of-speech (POS) tags become constituent labels and character-level tags are derived from word-level POS tags. A maximum entropy parser is then trained on the character-based corpu...